Method and system for managing personal information within independent computer systems and digital networks

ABSTRACT

A system and method for reliably and securely recording and storing all attributes of personal identification, for the identification and authorization of individual identity as well as attributes relating to it and personal data including, but not limited to, an individual's physical description, bank details, travel history, etc. (the “Personally Identifiable Information” or “PII”). PII can be difficult to manage in networks where correlation between data sources is required. Thus, in some embodiments, the system uses a distributed database to create a framework for robust security. The system manages the distributed database to associate transactions, or actions, using data, digital signatures, and/or cryptographic keys, which can be unique to an individual.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of, and claims priority to, U.S. patent application Ser. No. 16/869,354, which was filed May 7, 2020, which is a continuation of, and claims priority to, U.S. patent application Ser. No. 16/388,746, which was filed Apr. 18, 2019 and now issued as U.S. Pat. No. 10,678,944, which is a continuation of, and claims priority to, U.S. patent application Ser. No. 15/480,313, which was filed Apr. 5, 2017 and now issued as U.S. Pat. No. 10,311,250, which claims priority to U.S. Provisional Patent Application No. 62/318,648, which was filed Apr. 5, 2016. The disclosures of the Patent Applications are herein incorporated by reference in their entireties and for all purposes.

FIELD

The present disclosure relates to computer security, and more specifically, but not exclusively, to a system and method for personal identification data management based on, for example, verification and authentication of the personal identification information.

BACKGROUND

Traditional and generally accepted security measures and common security infrastructure, such as passwords, key management software, and two-factor authentication approaches, have failed to deliver reliable and secure protection of both the infrastructures they are meant to protect and the individual user's personal data.

The increased number of hacks, attacks, security breaches, successful fraud attempts, and stolen passwords from end-users, and even entire databases from private companies as well as public/government organizations, has led to declining trust from users regarding organizations that provision their credentials and the integrity of the personal data that is used to provide user access. Generally, data compromise generates a lack of confidence in trusting personally identifiable information to anyone. This increased user fear and concern for individual data privacy, as well as for the safety of personal data held by third parties, has led to increased technical challenges for organizations to maintain and protect the personally identifiable information of their users. For example, conventional methods typically require increased resources to improve data center monitoring and security, including firewalls, secure environments, data breach detection, penetration testing, and resilience exercises against potential hacks and security breaches.

The main reason for the lack of security in conventional systems is that outdated concepts and poor fundamental design are commonly used in technologies and practices aimed at establishing and protecting identity as well as an existing (or potential) user's personal details. Most organizations using these outdated technologies are forced to store any personal data collected centrally and store the personal data “as is,” that is, unencrypted. Even when it is encrypted, such data currently can be stolen and used elsewhere for nefarious purposes, due to the single point of compromise in the conventional approaches.

While there are many faults within conventional personal identity management systems, some examples include: storing data in its initial or apparent form; storing data in open form or unencrypted; storing data in encrypted form that can easily be restored to its initial or open form; storing of passwords including digital keys; existence of backdoors; not decentralized, “all eggs in one basket” storage; having a single point of compromise; and conceptually offering any form of “trusted authorities.”

In view of the foregoing, a need exists for an improved system for personal identity management in an effort to overcome the aforementioned obstacles and deficiencies of conventional data collection, storage, query, and management systems.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is an exemplary top-level block diagram illustrating one embodiment of cryptographic data and its partition into sub-components within a storage.

FIG. 2 is an exemplary top-level block diagram illustrating one embodiment of the cryptographic data of FIG. 1 being stored across a plurality of nodes within a distributed storage.

FIG. 3 is an exemplary detailed functional block diagram illustrating one embodiment of a data transfer process from a client side into the distributed storage of FIG. 2.

FIG. 4 is an exemplary detailed flow diagram illustrating an embodiment of a data lookup for existence within the distributed storage of FIG. 2.

FIG. 5 is an exemplary detailed block diagram illustrating one embodiment of the data verification and check process, such as for duplication and prior transactions within the distributed storage of FIG. 2.

FIG. 6 is an exemplary flow diagram illustrating one embodiment of a recording process and a transaction inside the distributed storage of FIG. 2.

FIG. 7 is an exemplary flow diagram illustrating one embodiment of a zero-knowledge authorization process.

FIG. 8 is an exemplary flow diagram illustrating one embodiment of an exemplary recording process and transaction of FIG. 6 and includes an event identifier generation process.

FIG. 9 is an exemplary functional block diagram illustrating one embodiment of a verification process that can be used with the distributed storage structure of FIG. 1.

FIG. 10 is an exemplary detailed functional block diagram illustrating another embodiment of the data transfer process of FIG. 3 wherein the data partition occurs on a client side.

FIG. 11 is an exemplary detailed flow diagram illustrating another embodiment of the data lookup of FIG. 4 wherein the identification data is extracted on the client side.

It should be noted that the figures are not drawn to scale and that elements of similar structures or functions are generally represented by like reference numerals for illustrative purposes throughout the figures. It also should be noted that the figures are only intended to facilitate the description of the preferred embodiments. The figures do not illustrate every aspect of the described embodiments and do not limit the scope of the present disclosure.

DETAILED DESCRIPTION

Since currently-available personal identity management systems are deficient because of outdated data storage and data management techniques, a system for personal identity management including recording, storing, verifying, authenticating and authorizing of personal identity and its attributes as well as related personally identifiable information (PII) can prove desirable and provide a basis for a wide range of data management applications, such as for digital identity access to international travel, banking, credit, insurance, medical records, and to prevent fraud or misuse of identity information. This result can be achieved, according to one embodiment disclosed herein, by a personal identity management system 100 as illustrated in FIG. 1. As used herein, personal identity management includes the management of any data relating to an individual's identity or personal credentials that contribute toward that individual's identity, such as, for example, an individual's identity documentation (e.g., a passport, ID card, birth certificate, and so on, including not just the entire document or its identifying unique number but also the individual data fields within them such as date of birth and place of birth), biometric data (e.g., fingerprints, voice, iris, face, height, eye and hair color), and other identification data (e.g., employee number, credit/debit cards, access codes, log ins, bookings, etc.).

Turning to FIG. 1, the personal identity management system 100 is shown as including cryptographic data 110. In a preferred embodiment, the cryptographic data 110 includes data that has been subjected to cryptographic functions such as cryptographic primitives including, but not limited to, one-way hash functions and encryption functions. The cryptographic data 110 is shown as comprising data sub-parts 111A-M. It should be understood that there can be any number of data sub-parts 111 comprising the cryptographic data 110. In fact, although shown and described as cryptographic data, the cryptographic data 110 can be partially subjected to cryptographic primitives or not subjected to them at all. However, the preferred embodiment comprises hashing the cryptographic data 110. By way of another example, the cryptographic data 110 can include a single sub-part 111, thereby representing the full data set of the cryptographic data 110, or up to sub-part 111M, thereby including M sub-portions of the cryptographic data 110. In yet another embodiment, a selected sub-part 111 can overlap with the data in another sub-part 111. In other words, the same portion of data can be maintained in two or more separate sub-parts 111. Alternatively, each sub-part 111 can contain only data that is not found in any other sub-part 111. The personal identity management system 100 is suitable for use with any type of storage 112, such as a decentralized distributed storage, including, but not limited to, for example, a distributed hash table, a distributed database, a peer-to-peer hypermedia distributed storage (e.g., InterPlanetary File System (IPFS)), a distributed ledger (e.g., Blockchain), an operating memory, a centralized database, a cloud-based storage, and/or the like. In other embodiments, the storage 112 is not decentralized or comprises a combination of distributed, decentralized servers, and centralized servers. In even further embodiments, the storage 112 can be maintained in operating memory of any component in the system 100. In a preferred embodiment, the storage 112 allocates each data sub-part 111 to one or more storage nodes 113.

In some embodiments, the system 100 comprises any number of storage nodes 113 as shown in FIG. 2, each having at least one processor and at least one physical or virtual/cloud-based storage (not shown). In another embodiment, the storage nodes 113 can comprise operating memory-based storage. In yet another embodiment, the storage nodes 113 can have physical storage, virtual/cloud-based storage, and an operating memory (not shown) to store data.

In a preferred embodiment, a selected storage node 113 does not comprise a complete set of data. For example, as shown in FIG. 2, a selected node 113, such as one of the storage nodes 113A, 113B, 113C, or 113N, maintains a fraction of the cryptographic data 110. In the event of a security compromise, data stolen from a selected node cannot be used for any meaningful purposes (e.g., human readable) because it represents an incomplete set of the raw data 115 (or only the hashed view, for example, of the cryptographic data 110). FIG. 2 illustrates a preferred embodiment in which the cryptographic data 110 is partitioned and stored on one or more storage nodes 113, such as 113A, 113B, 113C, and 113N, across the storage 112 for maximum security. As shown in FIG. 2, the data sub-parts 111A, 111B, 111C, and 111M are stored in one or more storage nodes 113 in the storage 112. In a preferred embodiment, if the cryptographic data 110 includes N sub-parts 111, and there are M storage nodes 113, the number M of storage nodes 113 is greater than the number N of sub-parts 111. Although shown and described in FIG. 2 as representing physical structures, it should be understood that each component of the storage 112 can virtualize several independent storage nodes 113 as a virtualized system.

In accordance with yet another embodiment, each sub-part 111 can be stored within one or more storage nodes 113 in parallel, to provide integrity, availability, and partition tolerance for the data. This contributes to a secure infrastructure, where a standalone node cannot become a single point of compromise.

In a preferred embodiment, the storage 112 enables adding new data, and prevents changes and/or removals of the data. In an alternative embodiment, at least one storage node 113 is provided with at least one processor configured to run a set of predefined operations to ensure that data can only be added.
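The add-only behavior described above can be illustrated with a minimal Python sketch. The class name `AppendOnlyStore` and its methods are hypothetical and do not appear in the disclosure; an in-memory dictionary stands in for a storage node 113, and a real node would enforce the same rule in its own predefined operations.

```python
class AppendOnlyStore:
    """Hypothetical stand-in for one storage node 113.

    Records can be added and read, but never changed or removed.
    """

    def __init__(self):
        self._records = {}

    def add(self, key: str, value: bytes) -> None:
        # Reject any write that would overwrite an existing record.
        if key in self._records:
            raise KeyError(f"record {key!r} already exists and cannot be changed")
        self._records[key] = bytes(value)

    def get(self, key: str) -> bytes:
        return self._records[key]

    def __contains__(self, key: str) -> bool:
        return key in self._records

    def __len__(self) -> int:
        return len(self._records)
```

The class deliberately exposes no update or delete operation, so the only way to change the state of the store is to add a record under a new key.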

In some embodiments, as a further security layer, all data transferred between a client 114 and the storage 112 can be protected using a secure connection (e.g., TLS/SSL, cipher, encoding, or any strong cipher together with (or without) SSL) such as shown in FIG. 3. Turning now to FIG. 3, one embodiment of an exemplary data transfer process from a variety of sources into the storage 112 is illustrated. As shown, a client 114 provides raw data 115 to a cryptographic function 116 (e.g., a cryptographic primitive such as a one-way hash function or an encryption function) to generate the cryptographic data 110. The cryptographic function 116 can include, for example, secure hash algorithm (SHA)-2, SHA-3, or any other reliable cryptographically strong hash function. The raw data 115 can be of any nature, any complexity, any size, and of any structure. For example, any binary data, from a 1-byte text file to a 5 TB video file, can be hashed.

The cryptographic data 110 is then partitioned into the sub-parts 111 for storage on any number of selected storage nodes 113 of the storage 112. As used herein, partitioned can include splitting, slicing, and any division or decentralization of data. In this preferred embodiment, the raw data 115 advantageously is not transferred through any unsecured (or even secured) medium between the client 114 and the storage 112.
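The hash-then-partition flow can be sketched in Python as follows. This is an illustration under stated assumptions, not the disclosed implementation: SHA-256 is picked as one member of the SHA-2 family, the sub-parts are contiguous non-overlapping slices, and the round-robin placement with two replicas is an arbitrary choice. The function names and the fifth node label `113D` are hypothetical.

```python
import hashlib


def hash_raw_data(raw: bytes) -> bytes:
    """Turn raw data 115 into cryptographic data 110 with a one-way hash."""
    return hashlib.sha256(raw).digest()


def partition(data: bytes, parts: int) -> list[bytes]:
    """Split data into `parts` contiguous, non-overlapping sub-parts 111."""
    if not 1 <= parts <= len(data):
        raise ValueError("parts must be between 1 and len(data)")
    size, extra = divmod(len(data), parts)
    out, start = [], 0
    for i in range(parts):
        # The first `extra` slices take one additional byte each.
        end = start + size + (1 if i < extra else 0)
        out.append(data[start:end])
        start = end
    return out


def assign_to_nodes(sub_parts: list[bytes], node_ids: list[str], replicas: int = 2) -> dict:
    """Place sub-part i on `replicas` consecutive nodes (assumed round-robin policy)."""
    if not 1 <= replicas <= len(node_ids):
        raise ValueError("replicas must be between 1 and the number of nodes")
    placement = {node: [] for node in node_ids}
    for i, part in enumerate(sub_parts):
        for r in range(replicas):
            placement[node_ids[(i + r) % len(node_ids)]].append((i, part))
    return placement


digest = hash_raw_data(b"example raw data 115")
sub_parts = partition(digest, 4)
placement = assign_to_nodes(sub_parts, ["113A", "113B", "113C", "113D", "113N"])
```

With four sub-parts spread over five nodes, no single node in this sketch holds every sub-part, which mirrors the preference for more storage nodes than sub-parts.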

Although FIG. 3 illustrates the cryptographic data 110 being partitioned in the storage 112, the cryptographic data 110 can also be partitioned into the sub-parts 111 on the client 114, such as shown in FIG. 10. As illustrated in FIG. 10, once the raw data 115 is processed through a cryptographic function 116, the cryptographic data 110 is partitioned into the sub-parts 111 prior to being stored in the storage 112. In yet another embodiment (not shown), the cryptographic data 110 can be partitioned on a combination of the client 114 and the storage 112.

Advantageously, by processing the raw data 115 through the cryptographic function 116 on the client 114, the system 100 does not maintain data in the open form in the storage 112. Accordingly, it is difficult for anyone to receive or steal personally identifiable data or any other meaningful data in its original easily accessible form, which is the standard open form typically used by conventional databases.

Each client 114 can generate a pair of cryptographic keys: a public key and an associated (large) private cryptographic secret key. In some embodiments, the system 100 can include at least one server-side processor to generate these pairs of keys. In another embodiment, a processor of the client 114 is configured to generate these key pairs. Yet another embodiment includes both server- and client-side processors to generate the pairs. The public key can represent a unique identifier of a selected user. In some embodiments, a secret key can be stored on the client 114 in a special vault. In an alternative embodiment, secret keys can be stored within the operating memory of the client 114.

In a preferred embodiment, any form of stored data (e.g., cryptographic data 110 shown in FIGS. 4 and 11) includes at least one set of identification data 118, which allows the system 100 to determine exactly one unique set of personally identifiable information (PII) among the entirety of the storage 112. Each set of identification data 118 is associated with a predetermined level of significance representing the level of trust in terms of cross checks and verification. The system 100 distinguishes between “knowledge of the data transferred or input” and “verifying or trusting the very same data.” Therefore, initial generation of the cryptographic data 110 is treated as unverified, and as the system 100 receives more feedback about cross checks and verification of any data/identity attributes, the predetermined level of significance (or trust level) increases. The higher the significance level/assigned level of trust, the more accurate and credible the stored data becomes within the system 100. As used herein, the data verification process can also include assigning an aggregated trust score to any individual data set as discussed herein, as well as any other flags, warnings, and other markers attached to data points or data sets.

In some embodiments, a combination of a public key along with the specific data credential sets (which act as identifiers/attributes to cross check within the system 100) are processed through cryptographic primitives (either on the client 114 or the storage 112) and stored within the storage 112, as personal identity data which can be cross checked for existence and whose attributes can be independently cross checked and verified.

The user's public key can be used to verify the signature of a user who has verified some data. It can also be used to verify any other flags, warnings, and other markers attached to data points or data sets as part of the risk-assessment or scoring within the system 100. However, the public key is not used to determine the existence of the personal data within the storage 112. For example, a selected user can verify their own personal data as they are in possession of their raw data 115.

Turning to FIG. 4, an exemplary process of determining whether the input data exists within the system 100 is shown. The system 100 can determine the existence of any of the personally identifiable data without maintaining the raw data 115. Each raw data entry 115 that needs to be checked against existing entries is processed through the cryptographic function 116 on the client 114. The cryptographic data 110 is sent to the storage 112, preferably via a secure connection such as TLS/SSL. FIG. 4 illustrates that the data partition occurs on the storage 112; however, the cryptographic data 110 can also be partitioned on the client 114 and transmitted to the storage 112 in sub-parts 111 to locate a stored data 119 match as shown in FIG. 11.

Returning to FIG. 4, the storage 112 extracts several sets of data identifiers 118 from the cryptographic data 110, and uses the data identifiers 118 to locate an exact record match as stored data 119 from the storage nodes 113. The system 100 then determines whether the raw data 115 already exists in the storage 112 to check potential errors in any combination of the data sets (e.g., the cryptographic data 110, the raw data 115, the data identifiers 118, and the stored data 119), based on comparing and checking credential sets from the client side as well as from the system 100 storage, such as shown in FIG. 5.

FIG. 5 illustrates an exemplary process for searching for potential mistakes within the cryptographic data 110 or the identification data 118, and determining whether existing records are duplicative of other entries within the storage 112. As shown, the storage 112 defines error patterns 120 that can be cross-referenced and checked against the stored data 119 and/or the identification data 118. Using the identified mistake patterns 121, the storage 112 locates similar records and, if successful, each new discovered pattern 122 is added to a patterns database (not shown).

In some embodiments, if the required credentials set is presented to the system 100, but a data match still cannot be found, the system 100 searches for possible errors, for example, by successively excluding one field (e.g., the data identifiers 118, the error patterns 120, exact record match 119, the identified mistake patterns 121, and the new discovered pattern 122) after another, via a trainable neural network, or any other decision making process depending on the business logic and purposes thereof.

By way of example, the process shown in FIG. 5 includes:

1) a client wishing to check the existence of and/or verify data sends that cryptographic data 110 to the system 100;

2) the system 100 searches for identifying credential sets in this data to find a unique record in the database (only hashes, no open data).

Each credential set has its level of significance, for example:

- first name + last name + birthdate + passport no => max level;
- first name + birthdate + passport no (without last name) => max level minus 1;
- first name + last name + passport no (without birthdate) => max level minus 2;
- and so on;

3) if the system 100 does not locate any set of credentials corresponding to the sent data, it returns an error or another response which indicates no data was found;

4) if the system 100 finds at least one credential set, the system 100 searches for such credentials in the storage 112;

5) if there is no such data, the system 100 searches for possible mistakes (e.g., the error patterns 120 and/or new patterns 122), by excluding one field after another and searching for similar data.

Searching for identifying credential sets in this data advantageously provides a high degree of confidence and accuracy, minimizing false negatives and maximizing true positives.
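The credential-set lookup of steps 1) through 5) can be sketched in Python as follows. This is a hedged illustration only: the field names, the numeric levels, the separator byte, and the functions `hash_fields`, `client_hashes`, and `lookup` are all hypothetical. It assumes the client hashes every credential set it can build, and it models "excluding one field after another" simply by falling back to the next, less significant set.

```python
import hashlib

# Hypothetical credential sets, ordered from most to least significant.
CREDENTIAL_SETS = [
    (("first_name", "last_name", "birthdate", "passport_no"), 3),  # max level
    (("first_name", "birthdate", "passport_no"), 2),               # max level minus 1
    (("first_name", "last_name", "passport_no"), 1),               # max level minus 2
]


def hash_fields(record: dict, fields: tuple) -> str:
    """Hash one credential set; only this digest ever leaves the client."""
    joined = "\x1f".join(f"{name}={record[name]}" for name in fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def client_hashes(record: dict) -> dict:
    """Client side: hash every credential set whose fields are all present."""
    return {
        fields: hash_fields(record, fields)
        for fields, _level in CREDENTIAL_SETS
        if all(name in record for name in fields)
    }


def lookup(sent: dict, stored: set):
    """Storage side: return (level, fields) of the best matching set, or None."""
    for fields, level in CREDENTIAL_SETS:
        digest = sent.get(fields)
        if digest is not None and digest in stored:
            return level, fields
    return None


alice = {"first_name": "Alice", "last_name": "Smith",
         "birthdate": "1990-01-01", "passport_no": "P1234567"}
stored = set(client_hashes(alice).values())   # the storage holds hashes only

typo = dict(alice, last_name="Smyth")          # one mistyped field
result = lookup(client_hashes(typo), stored)   # falls back to the set without last name
```

A query with a mistyped last name fails on the full set but still matches the set that omits the last name, at a lower significance level; a query that cannot form any credential set returns `None`, corresponding to step 3).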

In some embodiments, personally identifiable information (PII) coupled from various inputs of the raw data 115 or the cryptographic data 110 can be used for 1-1 matching, or 1-many matching. Within this context, the system 100 then turns the PII on the client 114 into a cryptographically secure form and then requires 1-1 matching accuracy to be maximum in order to guarantee maximal statistical separation between unique data sets and attributes of any identity.

The method described herein offers advantages analogous to those of DNA sequencing, such as high integrity and uniqueness of data preserved to the highest point of security and individuality, which allows a unique digital representation of an individual and their identity attributes to be developed, much like a Digital DNA.

In some embodiments, the system 100 also encodes data about each data input, data call, or associated markers for data assessment by the client 114 onto a distributed ledger, such as the ledger 129 shown in FIG. 6. The system 100 is suitable for use with a wide range of ledgers 129, such as any immutable distributed ledger, including, for example, a public Blockchain (e.g., Bitcoin® Blockchain, Ethereum® Blockchain, etc.) and/or a private Blockchain and/or the like. In some embodiments, the storage 112 could be the same as the ledger 129. In some embodiments, the ledger 129 comprises a combination of public and/or private Blockchains. In some embodiments, the system 100 provides the safety and integrity for multiple records and events within the system 100, all within the parameters of a single ledger transaction on the ledger 129. In some embodiments, each transaction corresponds to a single event within the storage 112. In alternative embodiments, each transaction represents a set of events or records within the storage 112.

Each new record (or combination of records) of a transaction within the storage 112 and the client 114 generates a ledger transaction 126 into the ledger 129 as shown in FIG. 6, which allows anyone to verify and validate the existence and accuracy of this data entry. Turning to FIG. 6, a preferred embodiment of verification includes analyzing the cryptographic data 110 in combination with a digital signature for the ledger transaction 126 that is provided to the ledger 129. Advantageously, anyone can validate the existence of the PII based on the cryptographic data 110 using the storage 112 and the ledger 129. In some embodiments, the system 100 can secure several independent cryptographic data 110 within a single ledger transaction 126, within the ledger 129 (shown in FIG. 8). With reference to FIG. 6, recording each ledger transaction 126 into the ledger 129 and the storage 112 is shown. As shown, the cryptographic data 110 is stored in the storage 112, while also being divided into core data 123 and metadata 124. In some embodiments, metadata 124 is not present within the cryptographic data 110, so core data 123 is equal to the cryptographic data 110. Metadata 124 can also be derived from external sources (not shown) and determined from other variables (e.g., timestamps). Both the core data 123 and the metadata 124 can be processed using the cryptographic function 116. A record hash 125 is shown as being generated from the metadata 124 and the core data 123. In some embodiments, the record hash 125 corresponds to the core data 123 (such as when metadata 124 is empty). The record hash 125 is distributed to the ledger transaction 126 as additional information. For example, when the ledger 129 represents a Bitcoin® Blockchain, and the ledger transaction 126 represents a Bitcoin® Blockchain transaction, the record hash 125 is written into an ‘OP_RETURN’ field of the ledger transaction 126. The ledger transaction 126 is broadcast over a ledger network 128. As soon as a new block (reflecting the transaction) is created on the ledger 129, the record(s) which the system 100 has placed within the ledger transaction 126 are secured inside the ledger 129 itself. Stated in another way, once the ledger transaction 126 is in the block, it is difficult to revert or tamper with it, so it is difficult to change its history. A record hash 125 is written to the transaction and anyone in possession of the raw data 115 can produce the same cryptographic data 110, check its existence within the storage 112, and validate/verify information input using the ledger 129.
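The derivation of the record hash 125 and its placement in an ‘OP_RETURN’ output can be sketched in Python as follows. The way core data and metadata are combined (canonical JSON, then SHA-256 over the concatenation) is an assumption made for illustration, and the function names are hypothetical. The sketch builds only the output script bytes; constructing, signing, and broadcasting a full ledger transaction 126 is outside its scope.

```python
import hashlib
import json
from typing import Optional

OP_RETURN = 0x6A  # Bitcoin script opcode that marks a data-carrying output


def record_hash(core_data: bytes, metadata: Optional[dict] = None) -> bytes:
    """Combine core data 123 and metadata 124 into a record hash 125."""
    if not metadata:
        # With empty metadata the record hash corresponds to the core data.
        return core_data
    # Assumption: metadata is serialized as canonical JSON before hashing.
    meta_bytes = json.dumps(metadata, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(core_data + hashlib.sha256(meta_bytes).digest()).digest()


def op_return_script(payload: bytes) -> bytes:
    """Build a script of OP_RETURN followed by one direct data push."""
    if not 1 <= len(payload) <= 75:
        raise ValueError("a direct push covers payloads of 1 to 75 bytes")
    return bytes([OP_RETURN, len(payload)]) + payload


core = hashlib.sha256(b"example raw data 115").digest()  # cryptographic data 110
rh = record_hash(core, {"timestamp": "2016-04-05T00:00:00Z"})
script = op_return_script(rh)
```

Anyone holding the same raw data and metadata can recompute `rh` and compare it with the payload found in the ledger transaction, which is the independent check the paragraph above describes.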

Advantageously, the system 100 doesn't just provide a system of information claims and results, which users are expected to blindly trust. Instead, the system 100 provides users with an independent verification of the results via the ledger transaction 126 directly, entirely by-passing the suggested system in order for users to check the results for themselves. As discussed, this independent verification ensures complete transparency in terms of the integrity of the records of the system 100 and both the claims and the results which the system 100 is able to provide to the requesting clients 114.

Furthermore, in a preferred embodiment, the storage 112 does not maintain data in its original or open form. In contrast, the raw data 115 can be first processed through the cryptographic function 116 on the client 114 as shown in FIGS. 3 and 10. This is advantageous in that hashed stored data cannot be reverse-engineered back to its original form in any way, even if a hacker were to obtain access to the data in full hashed view. In some embodiments, the personal identification management system 100 can have at least one processor on a client-side 114 configured to perform cryptography primitives on PII data sets (e.g., the raw data 115 and/or the cryptographic data 110).

Any input into the storage 112 as described above is followed by the generation of one or more ledger transactions 126 made in the ledger 129 as shown in FIG. 6, to provide a fully secured and trusted way of immutable data storage, validation/verification and authentication. As used herein, immutable applies to the principle that once data has been written to a blockchain, the data is difficult to manipulate, for example, even for a system administrator.

Each individual user of the system 100, such as a corporate member or a relevant authority, can be issued a (preferably large) cryptographic secret key (such as modifications 134 of FIG. 7, a private key 144 of FIG. 9). In some embodiments, the large cryptographic secret key can comprise a Rivest-Shamir-Adleman (RSA) key, an elliptic curve cryptography (ECC) key, and the like. Due to the known unique features of ECC, this large cryptographic secret key can be freely split into any number of independent parts (factors). These factors can be of any nature; some examples include, but are not limited to: tokens, passwords, biometric data and pin-codes. Particular embodiments include storing some parts on a physical memory, such as flash drives. Also, each component of the secret key can be additionally encrypted to increase the complexity of its partition. Each factor points to a specific location on a single elliptic curve based on the principles of elliptic curve cryptography (ECC), an approach to public-key cryptography based on the algebraic structure of elliptic curves over finite fields. ECC requires smaller keys compared to non-ECC cryptography (based on plain Galois fields) to provide equivalent security.
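One way to picture the splitting of an ECC secret key into independent factors is additive sharing of the private scalar, sketched below in Python. The disclosure does not specify a splitting scheme, so this choice is an assumption, as are the function names and the use of the secp256k1 group order as the modulus. The sketch works on integers only and performs no curve arithmetic; all factors are required to rebuild the key.

```python
import secrets

# Order of the secp256k1 group, used here only as an example modulus.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def split_key(secret: int, parts: int) -> list[int]:
    """Split a private scalar into `parts` additive factors modulo N."""
    if parts < 2:
        raise ValueError("need at least two parts")
    if not 0 < secret < N:
        raise ValueError("secret must be a valid scalar in the range 1..N-1")
    # All factors except the last are uniformly random.
    factors = [secrets.randbelow(N) for _ in range(parts - 1)]
    # The last factor is chosen so that the sum equals the secret modulo N.
    factors.append((secret - sum(factors)) % N)
    return factors


def combine(factors: list[int]) -> int:
    """Recombine all factors into the original private scalar."""
    return sum(factors) % N


private_key = secrets.randbelow(N - 1) + 1
factors = split_key(private_key, 3)      # e.g. one per token, password, device
restored = combine(factors)
```

Because every factor except the last is random, any proper subset of the factors reveals nothing about the key on its own; a deployment that must tolerate lost factors would need a threshold scheme instead.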

In this ECC example, the storing of the parts, which the keys can be broken up into, can be decentralized and distributed in any number of storage nodes 113, such as in a distributed key management structure and an entirely decentralized trust authority rather than one central one; in fact, the parts could be offline or need not be stored anywhere at all. In a further embodiment, a system comprises an unlimited number of server nodes, each one having at least one processor to perform data encryption/decryption and execute client requests. This additionally eliminates the necessity to store any parts of information relevant to the secret key; in particular, there is no need for them to be stored in one place. This significantly decreases the possibility of unauthorized access to the data/PII and provides higher protection for both individuals and organizations. In some embodiments, client-side vaults store some parts of a client's private keys. In other embodiments, these portions of private keys are not kept and can be requested from a client with each request to a server. Another embodiment can include a client-side processor to obtain and combine all parts of the secret key from a client before interacting with the server nodes.

In a preferred embodiment, the system 100 overcomes limitations of typical conventional systems:

1) It is difficult to store anything meaningful within conventional ledgers, for example, because conventional ledgers, by design, are not a suitable storage solution, and the rare fields within which any independent recording is possible are normally limited in length; furthermore, such ledgers are limited in the speed of creating and reading records placed within them, owing to the timing of block creation for such ledgers;

2) It is also difficult to fully protect anything meaningful within any conventional data storage such as relational databases, data warehouses, and so on.

Therefore, the storage 112 of the system 100 is in fact protected within the ledger 129 itself for security through the immutable ledger protocols.

Thereafter, any record within the storage 112 can be checked/validated/queried/verified in a decentralized and independent manner by any of the parties who already are in possession of the raw data 115 and are trying to check it for existence. Without any knowledge of the raw data 115, nothing can be checked, and therefore nothing can be hacked or stolen by potential attackers. One embodiment of the check/validation process is shown in FIG. 9.

The system 100 advantageously addresses the need to store zero personally identifiable data (which, by itself, embodies the very concept of privacy by design). In addition, the system 100 checks the personally identifiable data in a manner whereby these checks (including, but not limited to, verifications, flags, warnings, etc.) are recorded in such a way that they would be impossible to fake or adjust, all the while not storing any of the raw data 115 within the system 100.

Furthermore, the system 100 stores neither initial personally identifiable data, nor information about verifications of the personally identifiable data within any traditional storage per se. In order to achieve this, and instead of storing anything that is able to be reverse engineered (or human readable), the system 100 duplicates the results of each verification into both the storage 112 and the ledger 129 as shown in FIG. 6. If either the initial data or its verification were actually ‘stored’ elsewhere, or in the current ‘all eggs in one basket’ manner, both the data and its verifiers could be easily hacked or faked.

Moreover, since the protocols of the ledger 129 (especially public ledgers) are strict and do not allow records to be made by just anyone in any form, and since the speed of recording in the ledger 129 is limited, the system 100 can offset both of these limitations by using hash trees and by not storing any of the raw data 115, as shown in FIG. 8.

The advantage of the system 100 thus lies in a) not storing any of the raw data 115; b) not storing any cryptographic data 110 in its raw, original form (which could otherwise be decrypted or hacked and reconstructed for meaning and potential misuse); and c) working with hashed and, in some embodiments, split cryptographic data 110, which, even if hacked, would be impossible to restore to its initial form of raw data 115 or cryptographic data 110. This is data defense through mathematics. Therefore, when the cryptographic data 110 is not stored in any original, non-partitioned form, as in the preferred embodiment, the system 100 is protected from well-known attacks on hashes, such as brute-force attacks, rainbow-table attacks, and so on.
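As a minimal sketch of the hash-then-split idea, the following Python hashes a raw record and partitions the digest across storage nodes so that no single node holds the complete value. The choice of SHA-256, the function names, the four-node layout, and the round-robin placement are illustrative assumptions, not details taken from the disclosure.

```python
import hashlib


def make_cryptographic_data(raw_data: bytes) -> bytes:
    """One-way hash of the raw data (SHA-256 is an assumption)."""
    return hashlib.sha256(raw_data).digest()


def partition(data: bytes, parts: int) -> list:
    """Split data into `parts` contiguous sub-parts of near-equal size."""
    if not 1 <= parts <= len(data):
        raise ValueError("parts must be between 1 and len(data)")
    size, extra = divmod(len(data), parts)
    out, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        out.append(data[start:end])
        start = end
    return out


def distribute(sub_parts: list, node_count: int) -> dict:
    """Assign (index, sub-part) pairs to hypothetical nodes round-robin."""
    nodes = {n: [] for n in range(node_count)}
    for index, chunk in enumerate(sub_parts):
        nodes[index % node_count].append((index, chunk))
    return nodes


# Sample record with made-up values; only its hash ever leaves the client.
raw = b"Jane|Doe|1990-01-01|P1234567"
crypto_data = make_cryptographic_data(raw)
sub_parts = partition(crypto_data, 4)
nodes = distribute(sub_parts, 4)
```

In this sketch each node receives a single 8-byte fragment of the 32-byte digest, so compromising one node yields neither the raw data nor the full hash.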

Moreover, each result of each verification is encoded within the ledger 129 and protected within the ledger 129. The immutable ledger 129 protects the exact record as encoded by system 100 because, once the ledger block containing this record has been generated and broadcast to the network, it is difficult to change.

Therefore, what system 100 does keep a record of is the component sub-parts 111 of cryptographic data 110, as shown in FIG. 1, which are useless on their own and which no one would benefit from hacking. It also stores a duplicate record of the ledger transaction 126, as shown in FIG. 6, for each verification within the system 100, whereby the duplicate encoded on the ledger 129 becomes both immutable and publicly available for an existence check (provided that the existence check is performed by someone already in possession of the raw data 115; one cannot check data to which one has no access). The advantage of this process is that the cross-check becomes a decentralized process that forgoes the system 100 as a mediator; the cross-check is thus independent.

Advantageously, while not storing any raw data 115 (or even “raw” hashes such as cryptographic data 110), system 100 retains the ability to confirm whether given data exists, whether it contains any errors, and whether it is similar to other existing data (existing data meaning open-form data, not data within system 100, since the system stores no data in open, original form).

Simultaneously, any reference to system 100 verifications is held in a form that is incomprehensible to an attacker and that does not allow forgery of either the verification itself or the history of said data verification. This can therefore guarantee the ability to perform external cross-checks of any verification, fully bypassing system 100, directly via the parallel record of that verification made when it was duplicated onto the immutable ledger 129 (thus negating the issue that any verifications within system 100 could be hacked or falsified: if the verifications do not compare 1 to 1, the system 100 does not accept the verification).

Unlike ledger records, individual identity sets and their attributes are not binary. To achieve the above advantages, the system 100 is based on an adaptive strategy to distil and arrange the infrastructure of PII. Because identity sets, unlike records on public blockchains, are not binary but can contain many attributes and moving or changing parts, the system 100 handles the complexity of storing zero meaningful data while also providing verifications and duplicating them into the immutable ledger 129.

The authorization process is a zero-knowledge proof based on strong elliptic curve cryptography and a challenge-response protocol that verifies possession of a secret key.

An example authorization process is shown in FIG. 7 as a sequence diagram of zero-knowledge authorization. As shown, a client 130 sends a special identity request 132 to an authenticator 131. The authenticator 131 responds to the client 130 with a challenge 133 of a random big number, as per the RSA Factoring Challenge in encryption processes. The client 130 then makes the necessary modifications 134 of this big number using the private/secret key of the client 130 and sends the new, modified big number 135 back to the authenticator 131. The authenticator 131 checks the modified big number from the client 114 and responds with the result 136 as to whether the challenge-response was correct.
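The request, challenge, response, and result exchange can be illustrated with a Schnorr-style identification protocol, a well-known challenge-response proof of knowledge of a secret key. This is a sketch under stated assumptions rather than the disclosed protocol: it uses a toy multiplicative group instead of an elliptic curve so that it runs on the Python standard library alone, the parameters are far too small for real use, and all function names are hypothetical. Here the client's one-time commitment is assumed to accompany the identity request.

```python
import secrets

# Toy group parameters (assumption, insecure sizes): P = 2*Q + 1 with P and Q
# prime, and G generating the subgroup of order Q.
P, Q, G = 2039, 1019, 4


def keygen():
    """Return (secret key x, public key y = G^x mod P)."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)


def commit():
    """Client: pick a one-time nonce r and send t = G^r with the request."""
    r = secrets.randbelow(Q - 1) + 1
    return r, pow(G, r, P)


def challenge():
    """Authenticator: reply with a random challenge number."""
    return secrets.randbelow(Q)


def respond(x, r, c):
    """Client: combine the challenge with the secret key."""
    return (r + c * x) % Q


def check(y, t, c, s):
    """Authenticator: accept only if G^s == t * y^c (mod P)."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P


secret_key, public_key = keygen()
nonce, commitment = commit()
c = challenge()
s = respond(secret_key, nonce, c)
accepted = check(public_key, commitment, c, s)
```

The authenticator learns only that the response is consistent with the public key; the secret key itself is never transmitted.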

To add a check or a verification to any raw data 115, a verification authority or a client must create the cryptographic data 110 (this can be accomplished by using, for example, the cryptographic function 116), and then the authority or client can create a digital signature based on the cryptographic data 110 using its own secret key. Information about this check or verification is also stored in the storage 112. Like any other transaction within the system 100, information about this verification is also secured in the ledger 129 and is publicly accessible. The authorization process disclosed herein is a zero-knowledge proof of possession of this secret key. According to some embodiments, a selected storage node 113 includes at least one processor to perform the authorization process based on a zero-knowledge proof of work.

Any stored or transferred data must contain at least one set of credentials, such as data identification 118 and/or identity data 132. This allows the system 100 to determine exactly one match to a data set or to one set of credentials based on the comparison of data within the storage 112.

Each set of credentials has its own level of significance for performing a data search within the system. For example:

-   first name + last name + birthdate + passport number => max level;
-   first name + birthdate + passport number (without last name) => max level minus 1;
-   first name + last name + passport number (without birthdate) => max level minus 2;
-   and so on.
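The ranking of credential sets can be sketched as a small lookup. The field names, the numeric value of `MAX_LEVEL`, and the rule of returning the highest matching level are assumptions made for illustration; the disclosure specifies only the relative ordering of the three example sets.

```python
from typing import Optional

# Hypothetical field names and an arbitrary top level, chosen for illustration.
MAX_LEVEL = 10

CREDENTIAL_SETS = [
    (frozenset({"first_name", "last_name", "birthdate", "passport_number"}), MAX_LEVEL),
    (frozenset({"first_name", "birthdate", "passport_number"}), MAX_LEVEL - 1),
    (frozenset({"first_name", "last_name", "passport_number"}), MAX_LEVEL - 2),
]


def significance(provided: dict) -> Optional[int]:
    """Return the highest level whose credential set is fully provided."""
    present = {name for name, value in provided.items() if value}
    levels = [level for fields, level in CREDENTIAL_SETS if fields <= present]
    return max(levels, default=None)


# Made-up sample values.
full = {
    "first_name": "Jane",
    "last_name": "Doe",
    "birthdate": "1990-01-01",
    "passport_number": "P1234567",
}
no_last_name = {k: v for k, v in full.items() if k != "last_name"}
no_birthdate = {k: v for k, v in full.items() if k != "birthdate"}

level_full = significance(full)
level_no_last_name = significance(no_last_name)
level_no_birthdate = significance(no_birthdate)
```

A query that supplies none of the listed sets returns `None` in this sketch, which corresponds to having too few credentials for a unique match.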

The identification process (as opposed to the authorization/verification process) for a user identity or for attributes of personally identifiable information (PII) can thus be reduced to a successful query within the storage 112, where a full set of the raw data 115 has been processed through the cryptographic function 116. When providing the cryptographic data 110 for identification within the system 100, at least one set of the user's unique credentials must be included.

In order to provide both the trust and the security needed to solve the issues of managing personal information, and in order to eliminate the possibility of any hacks or fraud, each and every data/PII/identity transaction that occurs within the storage 112 is recorded in the ledger 129, which stores within itself all the existing records of transactions ever made. In some embodiments, the system 100 can have a processor configured to produce transactions into the ledger 129 immediately after any operation is performed with data within the system 100.

An example of a detailed process of recording information into the ledger 129, including how an Event Identifier 141 is created, is shown in FIG. 8. The cryptographic data 110 is hashed again using a cryptographic function 116, creating a Record Hash 125. Several of these Record Hashes 125 can be placed inside a new Block of Records Hashes 137. Once the block 137 is full, the Block of Records Hashes is hashed again using a cryptographic function 116, creating a Block Hash 138. In some embodiments, each block 137 can contain a single Record Hash 125, or the Block Hash 138 could be equal to the Record Hash 125 itself. In other embodiments, the Record Hash 125 could be equal to the cryptographic data 110 and the Block Hash 138 could be equal to the cryptographic data 110. In other words, the cryptographic data 110 could be used without additional hashing for further steps and could be viewed as Block Hashes 138. In some other embodiments, any number of hashing rounds could be applied to any of the steps producing Record Hashes 125 and Block Hashes 138. Several of these Block Hashes 138 are then placed inside a tree 139 with a root, creating a tree Root Hash 140. The Root Hash 140 is then placed into the ledger transaction 126. For a Bitcoin Blockchain, for example, this could be achieved by adding the Root Hash 140 to an ‘OP_RETURN’ field of the ledger transaction 126. That same ledger transaction 126 is then broadcast onto the network 128, and the system generates a transaction identifier within the ledger 129, creating a Transaction ID 142. Thereafter, the Record Hash 125, the Block Hash 138, the tree Root Hash 140, and the Transaction ID 142 are all used to generate an Event Identifier 141, and as soon as a new block on the ledger 129 is created, this Event Identifier record 141 is secured inside the ledger 129.
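The hashing pipeline of FIG. 8 can be sketched in a few lines of Python. Several details here are assumptions, since the text does not fix them: SHA-256 stands in for the cryptographic function 116, a block is hashed by concatenating its Record Hashes, the tree is a binary hash tree that pairs an odd node with itself, the Event Identifier is formed by hashing the concatenation of its four inputs, and the Transaction ID is a placeholder rather than a value returned by a real ledger.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Cryptographic function 116, assumed here to be SHA-256."""
    return hashlib.sha256(data).digest()


def block_hash(record_hashes: list) -> bytes:
    """Hash a full Block of Records Hashes into one Block Hash."""
    return h(b"".join(record_hashes))


def merkle_root(leaves: list) -> bytes:
    """Root of a binary hash tree; an odd node is paired with itself."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def event_identifier(record: bytes, block: bytes, root: bytes, txid: bytes) -> str:
    """Combine the four values (concatenate-then-hash is an assumption)."""
    return h(record + block + root + txid).hex()


# Four sample items of cryptographic data, hashed again into Record Hashes.
crypto_data = [h(f"record-{i}".encode()) for i in range(4)]
record_hashes = [h(c) for c in crypto_data]

# Two blocks of two Record Hashes each, then one Root Hash over the blocks.
blocks = [record_hashes[0:2], record_hashes[2:4]]
block_hashes = [block_hash(b) for b in blocks]
root = merkle_root(block_hashes)

txid = h(b"placeholder-transaction")  # stand-in for the ledger's Transaction ID
event_id = event_identifier(record_hashes[0], block_hashes[0], root, txid)
```

Only the 32-byte Root Hash would need to be written to the ledger, which is how many records can share a single size-limited ledger transaction.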

As demonstrated in FIG. 9, a Verificator 143 comprises the cryptographic function 116 to generate the cryptographic data 110 from the Personal Identifiable Information (PII) 147; the party verifying this data set uses a private key 144 (or secret key) to generate a digital signature 145, which is layered over the cryptographic data 110 that is being verified. In a preferred embodiment, the private key 144 is unique to each user, etc.

The resulting unique verification information 146, which includes both the cryptographic data 110 (generated from the raw data 115) and the digital signature 145 (generated by the party that is verifying this raw data 115), is then stored within the storage 112 as well as within the ledger 129; both of these storage operations are performed simultaneously or in parallel.

Any client 130 who wishes to check the prior existence of PII 147 (or any raw data 115), as well as its veracity and any associated attributes or verifications about the PII 147, the raw data 115, the cryptographic data 110, and so on, will also use the same cryptographic function 116 to generate the cryptographic data 110 version of the Personal Identifiable Information (PII) 147 and will then send this cryptographic data 110 version of the raw data 115 to the storage 112 in order to cross-check both its prior existence within the system 100 and any relevant verification information 146 in connection with the original set of PII 147.

The advantage of this process is that, should this verification information 146 already exist within the storage 112 of the system 100, any client 130 who wishes to check it may perform an independent check directly on the available records within the immutable ledger 129 (which should match those within the storage 112).
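The existence check can be sketched as follows. This is a simplified model under stated assumptions: two in-memory dictionaries stand in for the storage 112 and the ledger 129, SHA-256 stands in for the cryptographic function 116, the signature is an opaque placeholder string rather than a real digital signature, and the ledger is modeled as holding the record directly rather than a root hash.

```python
import hashlib


def cryptographic_function(raw_data: bytes) -> str:
    """Same one-way function on both sides (SHA-256 is an assumption)."""
    return hashlib.sha256(raw_data).hexdigest()


# Hypothetical in-memory stand-ins for storage 112 and ledger 129.
storage = {}
ledger = {}


def record_verification(raw_data: bytes, signature: str) -> str:
    """Verifier side: write the same verification record to both places."""
    key = cryptographic_function(raw_data)
    record = {"cryptographic_data": key, "signature": signature}
    storage[key] = dict(record)
    ledger[key] = dict(record)
    return key


def existence_check(raw_data: bytes):
    """Client side: only a holder of the raw data can derive the lookup key.

    Returns the record if storage and ledger agree one to one, else None.
    """
    key = cryptographic_function(raw_data)
    stored, on_ledger = storage.get(key), ledger.get(key)
    if stored is None or stored != on_ledger:
        return None
    return stored


# Made-up sample data.
key = record_verification(b"Jane|Doe|P1234567", "signature-placeholder")
found = existence_check(b"Jane|Doe|P1234567")
missing = existence_check(b"John|Roe|P7654321")
```

Because the lookup key is derived from the raw data, a party without that data has nothing to query, and a record altered in the storage alone fails the one-to-one comparison against the ledger copy.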

With respect to ECC, a large cryptographic secret key can be issued for every client and can be split without restriction into any number of independent factors, owing to the unique features of pairing-based elliptic curve cryptography. These factors can be of any nature; examples include, but are not limited to, tokens, passwords, biometric data, or PIN codes.

Similarly with respect to ECC, each factor (or share) of such a secret key can be additionally encrypted to dramatically increase the difficulty of hacking it. Each part of the multi-step authentication process points to a specific location on a single elliptic curve. Moreover, storage of these fractions is decentralized and distributed across any number of nodes, and in fact they do not have to be recorded anywhere at all.
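One way to picture splitting a secret key into independent factors is additive sharing of a scalar modulo a group order. The sketch below is an assumption-laden illustration, not the pairing-based construction the text refers to: the modulus is an arbitrary prime standing in for an elliptic-curve group order, factors are mapped to scalars with SHA-256, and the function names are hypothetical.

```python
import hashlib
import secrets

# Assumed modulus: a Mersenne prime standing in for an elliptic-curve group order.
ORDER = 2**127 - 1


def factor_to_scalar(factor: bytes) -> int:
    """Map an arbitrary factor (password, PIN, token) to a scalar."""
    return int.from_bytes(hashlib.sha256(factor).digest(), "big") % ORDER


def split_key(secret: int, factors: list) -> int:
    """Return the remainder share so that all shares sum to the secret."""
    total = sum(factor_to_scalar(f) for f in factors) % ORDER
    return (secret - total) % ORDER


def recombine(remainder: int, factors: list) -> int:
    """Rebuild the secret from the remainder share and the user's factors."""
    total = sum(factor_to_scalar(f) for f in factors) % ORDER
    return (remainder + total) % ORDER


secret = secrets.randbelow(ORDER)
factors = [b"1234", b"correct horse"]  # made-up PIN and password
remainder = split_key(secret, factors)
rebuilt = recombine(remainder, factors)
```

The PIN and password are never stored in this sketch; they are re-entered by the user, and only the remainder share needs to be kept, which on its own says nothing useful about the secret.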

These keys are never exchanged between clients 114 of the system 100. The need for any information related to part(s) of the secret/private key to be stored in one place is reduced, which significantly decreases any possibility of unauthorized access to personal data. Accordingly, the compromise of any node does not reveal any sensitive or usable information to potential attackers at any one point, reducing vulnerability to the point of making an attack near-impossible.

The systems and methods disclosed herein may be used in many different contexts in which identity verification or access management is required, such as applications for external uses, including:

Online services, including dating/professional service providers, whereby individuals interact in the digital as well as the physical world, with an emphasis on name and age verification and background checks.

Employment: verification of work permits and entry documentation/immigration, as well as associated background checks on individual identity and their attributes.

Adult Entertainment: age verification, payment verification, fraud detection.

Gambling: age verification, payment verification, fraud detection, previous user history and associated credit checks.

Immigration and cross-border movement of individuals: identity document checks, background checks and paperwork validity, citizenship and permits to travel, validating claims of identity and identity attributes.

Fintech: digital banking security, transaction security, identity claims for financial fraud and access to funds or financial services, clearance and compliance activity.

Debit/Credit Cards: anti-money laundering (AML), fraud detection, transaction security, clearance verifications and card replacement authentication.

Credit referencing and rating agencies: assurance of identity, fraud, previous behavior history, risk-based assessments.

National and International Travel: identity checks for countries of destination and their border authorities, no-fly lists, Interpol, politically exposed persons (PEP) lists, relevant law enforcement and government authorities, border control agencies, airport infrastructure, security and customs.

Airline security, airline know your customer (KYC) processes, passenger identification and risk assessment, inter-airline passenger behavior history, flight manifest verification, passport and visa checks, passport verification, identity document verification, booking data verification and accuracy checks (including online and mobile booking), fraud detection for payment, fraud detection for loyalty program claims and abuses, identity claim verification, advanced passenger information systems (APIS) verification and passenger reputational scoring.

Data Entry: correcting human error and automating the correct entry process (e.g., Companies House data input, which is currently manual; international travel passenger data input; credit referencing and rating agencies). All of this is manual and subject to human error, potential lack of attention to detail, the quality of staff training, and the impossibility of catching an error (for example, one as minute as a zero instead of the letter ‘O’).

Insurance: delayed flight insurance, credit card fraud insurance, mortgage insurance, payment default insurance; risk assessment for insurance premium calculations, as well as trust scores used for premium payouts and claims assessments.

Government Services: taxation, pensions, income declaration, revenue and customs assessments, tax evasion, etc.

National and International Individual Identity: documentation for car hire, real estate, medical services, and the need to verify both its veracity and validity as well as to assert ownership or a transfer of ownership.

Legal records: verifying the existence and veracity of claimed legally recorded proceedings and documentation, and verifying their source and the individual to whom they pertain.

Fraud protection: decentralized, automated, and client-controlled monitoring for fraud activities and unusual patterns in identity use or behavior, aggregate risk assessment, fraud detection and prevention.

Need To Know Basis: permission-based document exposure. Similarly, the present system and method may be advantageously used to allow users who are members of a pre-defined group to enter a closed system. This would include, for example, all employees of a company accessing that company's private network, or only those employees having specified security clearances accessing particular environments or documents in the private network.

Biotech: medical records, patient registry, administering the correct treatment to the correct patient, drug development based on the individual's biometric data, verification of medical notes and their source, right to access medical help.

Compliance with new privacy laws such as the General Data Protection Regulation (GDPR), and Privacy by Design.

The right to be forgotten (e.g., erasing or removing PII data)

The right to privacy

The disclosed embodiments are susceptible to various modifications and alternative forms, and specific examples thereof have been shown by way of example in the drawings and are herein described in detail. It should be understood, however, that the disclosed embodiments are not to be limited to the particular forms or methods disclosed, but to the contrary, the disclosed embodiments are to cover all modifications, equivalents, and alternatives.

What is claimed is:
 1. A method for managing personal identifiable information within independent computer systems and digital networks, the method comprising: processing raw data representing the personal identifiable information on a client device through one or more cryptographic primitives to generate cryptographic data representing the personal identifiable information; partitioning the cryptographic data into one or more sub-parts; storing each partitioned sub-part in one or more storage nodes across a storage, wherein the raw data is never transferred or stored through any unsecured medium between the client and the storage; generating one or more ledger transactions for storage in a ledger database, the one or more ledger transactions being associated with said storing of each partitioned sub-part in the one or more storage nodes, and each ledger transaction comprising a record hash of core data and metadata, the core data and metadata being extracted from the cryptographic data; and verifying the raw data by generating a pair of cryptographic keys comprising a public key and an associated private key and creating a digital signature of the cryptographic data.
 2. The method of claim 1, wherein said generating the ledger transaction comprises dividing the cryptographic data into core data and metadata, and generating a record hash from the core data and the metadata for storage in the distributed ledger.
 3. The method of claim 1, wherein said processing the received raw data comprises processing the raw data through a one-way hash function.
 4. The method of claim 1, further comprising receiving queried cryptographic data for searching for data matches; and searching the storage nodes for the queried cryptographic data.
 5. The method of claim 1, wherein said searching further comprises extracting a set of data identifiers from the queried cryptographic data, and determining whether an existing record is maintained in the storage for the raw data associated with the cryptographic data based on the extracted set of data identifiers.
 6. The method of claim 1, wherein said partitioning occurs on the client device.
 7. The method of claim 1, wherein said partitioning occurs on a remote data management server.
 8. The method of claim 1, wherein said storing comprises storing each partitioned sub-part in the one or more storage nodes across the storage that is disposed on a remote server.
 9. The method of claim 1, wherein said extracted set of data identifiers is associated with a predetermined level of significance.
 10. The method of claim 1, further comprising authorization of the cryptographic data based on an elliptic curve cryptography of the private key.
 11. A personal identifiable information system for data management and authentication within independent computer systems and digital networks, the system comprising: a client device for receiving raw data representing the personal identifiable information and comprising a cryptographic processor for processing the raw data through a cryptographic function to generate cryptographic data; a storage system communicatively coupled with the client device and comprising one or more storage nodes on a remote server, the storage system being unique from the client device, the one or more storage nodes for receiving and storing sub-portions of the cryptographic data, and wherein the raw data is never stored on the storage system; and a ledger database for storage of one or more ledger transactions associated with any event associated with the storage system, wherein the ledger database comprises at least one of a public blockchain and a private blockchain.
 12. The system of claim 11, wherein each ledger transaction comprises a record hash of core data and metadata, the core data and metadata being extracted from the cryptographic data.
 13. The system of claim 11, wherein said storage comprises at least one of a distributed database, a distributed hash table, a peer-to-peer hypermedia distributed storage, an operating memory, a centralized database, and a cloud-based storage.
 14. The system of claim 11, wherein a selected storage node maintains only a unique sub-portion of the complete set of cryptographic data.
 15. The system of claim 14, wherein a total number of sub-portions of the cryptographic data is less than a total number of storage nodes.
 16. The system of claim 11, wherein the cryptographic function comprises at least one of a secure hash algorithm-2 and a secure hash algorithm-3.
 17. The system of claim 12, wherein said client device further generates a pair of cryptographic keys comprising a public key and an associated private key.